Balancing the Communication Load of Asynchronously Parallelized Machine Learning Algorithms

Authors

  • Janis Keuper
  • Franz-Josef Pfreundt
Abstract

Stochastic Gradient Descent (SGD) is the standard numerical method used to solve the core optimization problem for the vast majority of machine learning (ML) algorithms. In the context of large-scale learning, as utilized by many Big Data applications, efficient parallelization of SGD is the focus of active research. Recently, we were able to show that the asynchronous communication paradigm can be applied to achieve a fast and scalable parallelization of SGD. Asynchronous Stochastic Gradient Descent (ASGD) outperforms other, mostly MapReduce-based, parallel algorithms for solving large-scale machine learning problems. In this paper, we investigate the impact of asynchronous communication frequency and message size on the performance of ASGD applied to large-scale ML on HTC cluster and cloud environments. We introduce a novel algorithm for the automatic balancing of the asynchronous communication load, which allows ASGD to adapt to changing network bandwidths and latencies.
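The abstract does not describe the balancing algorithm itself, so the following is only an illustrative sketch of the general idea of adapting communication frequency to network conditions. It is not the authors' method. The class `CommBalancer`, its parameters, and the use of the number of unsent messages as a congestion signal are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class CommBalancer:
    """Hypothetical controller (not from the paper): adapts how many local
    SGD steps a worker performs between two outgoing update messages."""
    interval: int = 8        # assumed starting value: local steps per message
    min_interval: int = 1    # never send more often than every step
    max_interval: int = 1024 # never wait longer than this many steps
    high_water: int = 4      # assumed backlog threshold signalling saturation

    def update(self, queued_messages: int) -> int:
        """Adjust the interval from the current number of unsent messages."""
        if queued_messages > self.high_water:
            # The network cannot keep up: back off multiplicatively.
            self.interval = min(self.interval * 2, self.max_interval)
        elif queued_messages == 0:
            # The send queue is empty: use spare bandwidth, send more often.
            self.interval = max(self.interval - 1, self.min_interval)
        # Otherwise the backlog is moderate and the interval is left unchanged.
        return self.interval


def simulate(backlogs):
    """Feed a sequence of observed backlogs to a fresh balancer and
    return the interval chosen after each observation."""
    balancer = CommBalancer()
    return [balancer.update(q) for q in backlogs]


if __name__ == "__main__":
    print(simulate([0, 0, 10, 10, 2, 0]))
```

In this sketch a worker would call `update` periodically with its current send-queue length and then run the returned number of local SGD steps before its next send. The asymmetric rule (double on congestion, decrease by one when idle) is one simple design choice among many.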

Similar resources

Machine Learning Approach to Tuning Distributed Operating System Load Balancing Algorithms

This work concerns the use of machine learning techniques (genetic algorithms) to optimize load balancing policies in the openMosix distributed operating system. Parameters and alternative algorithms in the openMosix kernel were dynamically altered or selected based on the results of a genetic algorithm fitness function. In this fashion, optimal parameter settings and algorithm choices were sought for...


Load Balancing in Data Warehouse – Evolution and Perspectives

Load balancing is one of the crucial problems in distributed data warehouse systems. In this article, original load balancing algorithms are presented: the Adaptive Load Balancing Algorithms for Queries (ALBQ) and the algorithms that use grammars and learning machines in managing the ETL process. These two algorithms base the load balancing on query analysis; however, the methods...


Online Distribution and Load Balancing Optimization Using the Robin Hood and Johnson Hybrid Algorithm

Proper planning of assembly lines is one of production managers' concerns at the tactical level, since it makes it possible to use machine capacity, reduce operating costs, and deliver customer orders on time. The lack of an efficient method for balancing assembly lines can create serious problems for manufacturing organizations. The use of assembly line balancing methods cannot balan...


Adaptive load balancing of parallel applications with multi-agent reinforcement learning on heterogeneous systems

We report on the improvements that can be achieved by applying machine learning techniques, in particular reinforcement learning, to the dynamic load balancing of parallel applications. The applications considered here are coarse-grained, data-intensive applications. Such applications put high pressure on the hardware interconnect. Synchronization and load balancing in complex, heter...


Application of Parallelized Apriori in Grid Computing Environment

The goal of the strategy is to improve the performance of distributed algorithms and their responsiveness. Association rule mining algorithms have high computational complexity due to the size of their search space and their high data access demands. The work aims at mining the data in a grid computing environment, which computes by distributing the data to its clusters and mines it in...



Journal title:
  • CoRR

Volume abs/1510.01155

Publication date: 2015